[fix](pipeline) fix comment-trigger workflow crash due to GitHub API rate limit#63304

Open
hello-stephen wants to merge 1 commit into master from fix/pipeline-trigger-rate-limit


@hello-stephen

Problem

When a PR author posts a second run buildall comment within a short window, the
comment-to-trigger-teamcity workflow can silently fail, leaving TeamCity
not triggered even though the Actions run appears to have executed.

The root cause is two connected bugs:

Bug 1 — Unauthenticated GitHub API calls hit rate limits

_get_pr_changed_files_count and _get_pr_changed_files in
regression-test/pipeline/common/github-utils.sh make anonymous curl requests
(no Authorization header). GitHub's anonymous rate limit is 60 req/h per IP.
GitHub Actions runner IPs are shared across every workflow in the org, so hitting
the limit is common when many workflows run concurrently.
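
One way to confirm that a failure is rate-limit related is to look at the `x-ratelimit-remaining` response header that the GitHub REST API returns. The sketch below is an illustration only: `rate_limit_remaining` is a hypothetical helper, not something that exists in github-utils.sh, and it only parses headers that are piped to it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of github-utils.sh): read HTTP response
# headers on stdin and print the value of x-ratelimit-remaining, if present.
rate_limit_remaining() {
    awk -F': *' 'tolower($1) == "x-ratelimit-remaining" {
        gsub(/\r/, "", $2)   # header lines end in CRLF; drop the CR
        print $2
        exit
    }'
}

# Example use (needs network access, so it is left commented out):
# curl -sS -o /dev/null -D - https://api.github.com/rate_limit | rate_limit_remaining
```

A value of 0 on an anonymous request from a shared runner IP would be consistent with the failure described above.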

Bug 2 — Missing all_files crashes step 5

When _get_pr_changed_files fails (10 retries exhausted), step 4 ("Check if pr
need run build") has a correct fallback that defaults everything to trigger-all —
but it does not create the all_files file.

Step 5 ("Check for sensitive pipeline script changes") then executes:

done < all_files   # bash: all_files: No such file or directory → exit 1

This makes step 5 fail, which causes all downstream TeamCity trigger steps to be
skipped. The workflow shows conclusion: failure with no useful message to the PR
author.
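
The crash can be reproduced outside the workflow with a minimal sketch. The loop body below is a placeholder, not the real step 5 script; the only assumption carried over is that GitHub Actions runs bash steps with `-e`, which is why a separate `bash -e` process is used.

```shell
#!/usr/bin/env bash
# Reproduction sketch: read from all_files in a directory where it was never
# created, under `bash -e` as a GitHub Actions step would run it.
workdir=$(mktemp -d)
status=0
(cd "$workdir" && bash -e -c '
    while read -r file; do
        echo "checking ${file}"      # placeholder for the real check
    done < all_files                 # the redirection fails: no such file
    echo "not reached when all_files is missing"
') >/dev/null 2>&1 || status=$?
rm -rf "$workdir"

if [ "$status" -ne 0 ]; then
    echo "step fails with a non-zero exit status"
fi
```

Because the redirection belongs to the whole `while` loop, the loop never starts, and with `-e` the script stops before any later command runs.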

Fix

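regression-test/pipeline/common/github-utils.sh: add an optional auth header
to both curl calls using the shell idiom
${GITHUB_TOKEN:+-H "Authorization: Bearer ${GITHUB_TOKEN}"}.
When GITHUB_TOKEN is set in the step's environment (in GitHub Actions the
workflow has to pass it in explicitly, for example from secrets.GITHUB_TOKEN),
the requests are authenticated and the higher authenticated rate limit applies
(5,000 req/h for a user token; GitHub documents 1,000 req/h per repository for
the Actions-issued GITHUB_TOKEN). When unset (local manual usage), the flag
expands to nothing and behaviour is unchanged.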
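
The behaviour of the `${VAR:+word}` idiom can be checked in isolation. In this sketch `count_args` is a hypothetical helper that only reports how many words it received, and the token value is a dummy; no request is made.

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper: it prints the number of arguments.
count_args() { echo "$#"; }

# Token unset: the expansion produces no words at all.
unset GITHUB_TOKEN
count_args ${GITHUB_TOKEN:+-H "Authorization: Bearer ${GITHUB_TOKEN}"}   # prints 0

# Token set: the expansion produces exactly two words, -H and the header.
GITHUB_TOKEN="dummy-token-for-illustration"
count_args ${GITHUB_TOKEN:+-H "Authorization: Bearer ${GITHUB_TOKEN}"}   # prints 2
```

The outer expansion is left unquoted on purpose: quoting it would pass a single empty argument to curl when the token is unset, while the inner double quotes keep the header value together as one word.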

.github/workflows/comment-to-trigger-teamcity.yml — add an early-exit guard
in step 5 before reading all_files. If the file is absent (because the API call
in step 4 failed), the step exits 0 with an explanatory warning instead of
crashing. The previous step already defaulted to trigger-all in this scenario, so
no functionality is lost.
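
A guard of this kind could look roughly like the sketch below. It is not the code from the PR: `check_sensitive_changes` is a hypothetical function standing in for the inline step, the loop body is a placeholder, and the warning text is invented for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for step 5. The real step runs inline in the workflow
# and would use `exit 0` where this function uses `return 0`.
check_sensitive_changes() {
    local file_list="${1:-all_files}"
    if [[ ! -f "${file_list}" ]]; then
        # ::warning:: is the GitHub Actions workflow command for annotations.
        echo "::warning::${file_list} not found; skipping sensitive-file check"
        return 0
    fi
    while read -r file; do
        echo "checking ${file}"      # placeholder for the real check
    done < "${file_list}"
}
```

Testing for the file before the loop keeps the failing redirection from ever being evaluated, so the step ends successfully and the downstream trigger steps still run.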

Reproduction

Observed on run 25771630114, triggered by a comment on PR #63110.

…te limit causes missing all_files

Two issues fixed:
1. `_get_pr_changed_files_count` and `_get_pr_changed_files` in github-utils.sh
   made unauthenticated curl requests, hitting GitHub's 60 req/h anonymous rate
   limit on shared Actions runner IPs. Added `${GITHUB_TOKEN:+-H "Authorization:
   Bearer ${GITHUB_TOKEN}"}` so requests use the 5000 req/h authenticated limit
   when GITHUB_TOKEN is available (no-op when unset, preserving local usage).

2. When `_get_pr_changed_files` fails, step 4 falls back to trigger-all but does
   NOT create the `all_files` file. Step 5 then crashes with "No such file or
   directory" on `< all_files`, causing conclusion=failure and skipping all
   TeamCity trigger steps. Added an early-exit guard in step 5 to handle this
   gracefully.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@hello-stephen

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@hello-stephen

run buildall

@hello-stephen

TPC-H: Total hot run time: 30898 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit e9cff2ea4400bd0bfad08de1adeec7775712d99e, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17934	3894	3828	3828
q2	q3	10765	1354	797	797
q4	4684	473	343	343
q5	7619	2266	2074	2074
q6	238	182	137	137
q7	911	774	652	652
q8	9355	1682	1483	1483
q9	6454	4865	4946	4865
q10	6448	2109	1807	1807
q11	442	275	250	250
q12	694	433	293	293
q13	18245	3349	2771	2771
q14	267	251	234	234
q15	q16	817	790	711	711
q17	993	903	911	903
q18	6881	5735	5541	5541
q19	1249	1237	1084	1084
q20	582	429	272	272
q21	5981	2822	2526	2526
q22	453	368	327	327
Total cold run time: 101012 ms
Total hot run time: 30898 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4560	4518	4754	4518
q2	q3	4770	5171	4690	4690
q4	2202	2206	1392	1392
q5	4811	4723	4628	4628
q6	232	178	130	130
q7	1889	1771	1536	1536
q8	2319	1894	1886	1886
q9	7241	7201	7144	7144
q10	4523	4392	3991	3991
q11	530	379	353	353
q12	709	724	505	505
q13	2972	3296	2794	2794
q14	293	296	263	263
q15	q16	681	701	611	611
q17	1239	1241	1215	1215
q18	7288	6730	6787	6730
q19	1100	1073	1105	1073
q20	2206	2204	1923	1923
q21	5313	4624	4486	4486
q22	527	454	420	420
Total cold run time: 55405 ms
Total hot run time: 50288 ms
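
The result rows carry no column headers. Reading them as query, cold run, two hot runs, and the faster hot run is an assumption, but it is consistent with the reported totals: for Round 1, the first timing column sums to 101012 and the last to 30898. A small sketch for re-checking such totals follows; `sum_run_times` is a hypothetical helper, not part of tpch-tools.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read result rows on stdin and print the cold total
# (fourth field from the end) and the hot total (last field).
sum_run_times() {
    awk '
        # Only rows that start with a query label and end in a numeric field;
        # rows without timings (a bare label) are skipped.
        /^q/ && NF >= 5 && $NF ~ /^[0-9]+$/ {
            cold += $(NF - 3)
            hot  += $NF
        }
        END { printf "cold=%d hot=%d\n", cold, hot }
    '
}

# Usage: paste the rows of one round into a file and run
#   sum_run_times < round1.txt
```

Counting fields from the end of the row keeps the sum correct for lines such as the one where the `q2` and `q3` labels share a row.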

@hello-stephen

TPC-DS: Total hot run time: 168894 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit e9cff2ea4400bd0bfad08de1adeec7775712d99e, data reload: false

query5	4317	639	496	496
query6	335	230	211	211
query7	4219	571	311	311
query8	333	232	214	214
query9	8846	3963	3954	3954
query10	439	341	301	301
query11	5804	2403	2189	2189
query12	190	130	134	130
query13	1345	629	429	429
query14	6374	5368	5046	5046
query14_1	4360	4355	4343	4343
query15	215	202	186	186
query16	1000	493	441	441
query17	1166	753	606	606
query18	2626	492	361	361
query19	216	209	171	171
query20	142	132	130	130
query21	215	139	120	120
query22	13646	13542	13378	13378
query23	17124	16342	15987	15987
query23_1	16197	16239	16063	16063
query24	7650	1774	1313	1313
query24_1	1300	1296	1303	1296
query25	570	503	445	445
query26	1305	313	173	173
query27	2740	553	341	341
query28	4440	1939	1932	1932
query29	996	642	506	506
query30	308	240	203	203
query31	1120	1062	957	957
query32	86	77	75	75
query33	550	359	305	305
query34	1165	1164	638	638
query35	782	792	659	659
query36	1380	1347	1235	1235
query37	154	108	134	108
query38	3215	3150	3051	3051
query39	925	925	897	897
query39_1	884	881	867	867
query40	222	146	125	125
query41	68	66	63	63
query42	110	109	107	107
query43	321	322	278	278
query44	
query45	207	198	194	194
query46	1049	1157	749	749
query47	2347	2374	2254	2254
query48	413	407	290	290
query49	635	485	382	382
query50	1002	333	260	260
query51	4498	4338	4260	4260
query52	108	108	97	97
query53	257	288	215	215
query54	322	295	255	255
query55	94	92	87	87
query56	300	312	315	312
query57	1481	1437	1341	1341
query58	302	277	265	265
query59	1585	1623	1426	1426
query60	339	321	313	313
query61	163	158	149	149
query62	668	620	572	572
query63	238	202	204	202
query64	2385	795	636	636
query65	
query66	1702	482	362	362
query67	30071	29951	29815	29815
query68	
query69	467	333	318	318
query70	1073	1013	970	970
query71	301	278	264	264
query72	2965	2674	2014	2014
query73	834	783	411	411
query74	5069	4896	4774	4774
query75	2675	2578	2278	2278
query76	2288	1142	752	752
query77	383	404	334	334
query78	11991	12319	11643	11643
query79	1478	1014	762	762
query80	636	565	451	451
query81	448	277	236	236
query82	1385	160	126	126
query83	355	271	255	255
query84	257	142	114	114
query85	883	547	452	452
query86	393	346	318	318
query87	3404	3402	3223	3223
query88	3483	2640	2652	2640
query89	445	375	337	337
query90	1886	178	182	178
query91	180	166	147	147
query92	80	78	70	70
query93	1545	1364	946	946
query94	543	349	316	316
query95	697	477	343	343
query96	1037	805	321	321
query97	2722	2727	2581	2581
query98	236	235	243	235
query99	1110	1115	998	998
Total cold run time: 253824 ms
Total hot run time: 168894 ms
